DETAILED ACTION

Response to Arguments 

Applicant's arguments filed 5/9/2008 have been fully considered but they are not 
persuasive. Firstly applicant has amended independent claims to recite a "linked video 
file comprise of a pixel object file and a separate data object file, including information 
related to the object that corresponds to the selected pixel object, the data object file 
being linked to the corresponding pixel object file..." and wherein said linked video file 
is "configured to be exportable to a media player..." and has subsequently argued that 
Rangan in combination with Feinleib and Courtney fail to teach this limitation. Examiner 
respectfully disagrees. It should be noted that Rangan teaches pixel object files, i.e. a 
separate data stream is created which is not embedded in the video content. (Col 5 
lines 1-20, Col 6 lines 46-65, Col 9 lines 18-29, Col 10 lines 50-58) It should be noted 
that this separate data stream is synchronous to the original video stream and contains 
frame-by-frame coordinate data of the tracked image entity, (pixel object files). In 
addition Feinleib teaches data object files (enhancing content which is not embedded in 
video content). (Col 3 lines 51-65, Col 9 lines 27-39, Col 11 lines 17-27) Feinleib
teaches a technique for synchronizing this enhancing content (closed captioning for 
example) with the primary content (video files) in such a manner which is independent 
of how these separate contents are delivered to the viewer computing units. Again 
applicant's specification merely recites that data object files could be "overlay
information..." etc., and thus enhancing content such as closed captioning, animation,
text, hypermedia, etc., as taught by Feinleib may be enhancement content.

Additionally applicant has argued that Rangan in combination with Feinleib and
Courtney fail to teach applicant's amendment of "...a linked video file is configured to be
exportable to a media player." Examiner respectfully disagrees. It should be noted that
this recitation is merely a statement of intended use. Language which suggests or makes
optional the export of video files to a media player, but does not require steps to be
performed or limit a claim to a particular structure, does not limit the scope of a claim
or claim limitation. (See MPEP 2106 paragraph II, Section C). Furthermore it should be noted
that Rangan explicitly teaches enhancing content (data object files) may reside on a 
computer disk or a CD-ROM which can be accessed during playing of the video file, 
thus suggesting that it would be possible to configure such data files (a hyperlink, for
example) found on a CD-ROM to be exportable (transferred) to a media player so that
the files can be accessed.

In addition, applicant has noted persons of ordinary skill in the art would not
combine Rangan with Feinleib to produce the claimed invention since "this has
nothing to do with the invention claimed in the subject application". Examiner notes that
the motivation to combine references under 35 USC 103 does not have to be the same 
as that intended by the applicant nor does it have to solve the same problem applicant 
intends to solve in the art. Lastly, applicant has argued that while Courtney teaches a
capture rate of 3 FPS, Courtney's 315 frames are a sample, not a predetermined cluster,
and persons of ordinary skill in the art would not combine Rangan and Feinleib with
Courtney. Examiner respectfully disagrees. It should be noted that Courtney is utilized
to simply show it was known in the art to sample video content at a sample rate which is 
a divisor of plural standard playback rates. Again it should be noted that the video
sampling rate as taught by Courtney is 3 frames per second. Further in regards to 
applicant's assertion that the motivation of Courtney has nothing to do with the present 
invention, again examiner notes the motivation to combine references under 35 USC 
103 does not have to be the same as that intended by the applicant nor does it have to 
solve the same problem applicant intends to solve in the art. 

Claim Rejections - 35 USC § 103 
The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and 
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 

Claims 1-3, 11, 13-17, 19-27 and 29 are rejected under 35 U.S.C. 103(a) as 
being unpatentable over Rangan (6198833) in view of Feinleib (6637032) in further view 
of Courtney (6424370).

Regarding claims 1 and 19, Rangan teaches an image processing system for 
processing video content in a sequence of video frames and linking a pixel object 
embedded in said video content to data corresponding to the pixel object in a sequence 
of video frames by explaining a system is provided for tracking a moving entity in a 
video presentation, the system comprising a computer station presenting the video 
presentation on a display as a series of bitmapped frames; and a tracking module 
receiving the video data stream. (Col 3, lines 26-29); said image processing system 
comprising a video capture system for capturing a frame of said sequence of video 



Application/Control Number: 10/786,777 Page 5 

Art Unit: 2628 

frames by showing a recording function for accepting the positions wherein the pixel 
signature (defined in the art as a local neighborhood around a given pixel) most closely
matches the image signature as the true positions of the image entity in the next 
frames. (Col 3, lines 43-46) and in FIG. 1 input data stream 15 to tracking module 13 is 
a stream of successive bitmapped frames in a normalized resolution, required by the 
tracking module. (Col 5, lines 35-37) The authoring station can be based on virtually 
any sort of computer platform and operating system, and in a preferred embodiment, a 
PC station running MS Windows is used, in which case the input stream 16, regardless 
of protocol, is converted to a digital video format that can be interpreted and played 
back as a sequence of bitmapped frames. (Col 5, lines 37-43) Furthermore Rangan 
teaches a user interface for enabling a user to select the pixel object in said captured 
frame. (Col 4 lines 11-35). Additionally Rangan teaches a pixel object tracking system,
which includes a processor and which automatically tracks said selected pixel objects in
other frames. (Col 3, lines 26-50). It should be noted that it is well known in the art that
a computer system would inherently contain a processor. Rangan also teaches said
video linking system generating one or more linked video files, separate from said video 
content said linked video file comprising a pixel object file identifying the selected pixel 
object by frame number and location within the captured video frame and at least one 
subsequent video frame (Col 6 lines 48-51, Col 10 lines 53-66) by explaining when 
tracking element 29 (Fig. 2) is positioned and activated over an image entity to be 
tracked, a signature table is created and stored (Col 8, lines 40-42) and upon tracking 
element 29 being activated the tracking module creates a table or list comprising pixel 
values associated with a target number and spatial arrangement of pixels associated
with tracking element 29. (Col 7, lines 40-43). It should be noted Rangan further 
teaches providing one or more links to predetermined data objects for each pixel object. 
(Col 7 lines 25-52, Fig. 2) Rangan also teaches linked video files are synchronized with
said video content. (Col 6, lines 48-51 and Col 10, lines 53-56) Furthermore Rangan
teaches wherein said linked video files are configured so that selected locations in said
video frames by a pointing device during playback of the video content can be linked
with said data objects when said selected locations correspond to said pixel objects. (Col
7 lines 35-52) It should be noted that the pointing device as taught by Rangan is a mouse.
Nonetheless, Rangan fails to explicitly teach a separate data object file that includes 
information related to the object that corresponds to the selected pixel object, the data 
object file being linked to the corresponding pixel object file. This is what Feinleib 
teaches. (Col 3 lines 51-65, Col 9 lines 27-39, Col 11 lines 17-27) It should be noted 
that Feinleib teaches enhancing content may reside in a viewer's home and is
synchronized by a closed caption script of the primary content with the synchronization
independent of how and when the enhancing content or primary content is delivered
to the viewer computing units. (Col 6 lines 23-30) It should be noted that Feinleib 
teaches linked data objects could be (configured to be) exportable to a media player as 
the data objects can reside on a computer disk or CD ROM. (Col 5 lines 40-44) Again, 
Feinleib explicitly teaches enhancing content can be delivered independently of the 
primary content and synchronized at the viewer-computing unit using the closed 
captioning script, which accompanies the primary content. (Col 9 lines 30-40). It would 
have been obvious to one of ordinary skill in the art at the time the invention was made
to combine the teachings of a separate data object file that includes information related to
the object linked to the video content into the system of Rangan because
enhancements to primary content can be timely introduced at desired junctures of the 
primary content. (Col 2 lines 14-20) However neither Rangan nor Feinleib explicitly 
teaches a video linking system which samples video content at a sample rate which is a 
divisor of plural standard playback rates. This is what Courtney teaches (Col 15 line 28- 
Col 16 line 24, Fig. 24, Col 16 line 51 -Col 7 line 40) It should be noted that Courtney 
teaches 315 frames captured at approximately 3 frames per second (Col 16 lines 8-25) 
and it is well known in the art to utilize standard playback rates such as NTSC at 30 
FPS and FPS at 12 FPS. It would have been obvious to one of ordinary skill in the art 
at the time the invention was made to combine the teachings of sampling video content 
at 3 frames per second (a divisor of a plurality of standard playback rates) into the 
combination of Rangan and Feinleib because testing the quality of motion based event 
detection of differing frame rates (such as 3 frames per second) can be achieved and 
thus, providing more intelligent feedback regarding the occurrence of complex object 
actions such as inventory theft (Col 2 lines 59-63) can be realized. 

Claim 29 is similar in scope to claim 1 except for the recitation of clustering the 
sampled video content with plural frames per cluster. Courtney also teaches this (Col 
15 line 28-Col 16 line 24, Fig. 24, Col 16 line 51 -Col 7 line 40). For example, it should 
be noted that Courtney teaches a 315 frame cluster captured at 3 frames per second. It 
would have been obvious to one of ordinary skill in the art at the time the invention was 
made to combine the teachings of a 315 frame cluster sampled at 3 frames per second
into the combination of Rangan and Feinleib because testing the quality of motion 
based event detection of differing frame rates (such as 3 frames per second) can be 
achieved and thus, providing more intelligent feedback regarding the occurrence of 
complex object actions such as inventory theft (Col 2 lines 59-63) can be realized. 

Regarding claims 17 and 27, Courtney teaches clustering the sampled video 
content with plural frames per cluster. (Col 15 line 28-Col 16 line 24, Fig. 24, Col 16 line 
51 -Col 7 line 40). For example, it should be noted that Courtney teaches a 315 frame 
cluster captured at 3 frames per second. It would have been obvious to one of ordinary 
skill in the art at the time the invention was made to combine the teachings of a 315 frame
cluster sampled at 3 frames per second into the combination of Rangan and Feinleib 
because testing the quality of motion based event detection of differing frame rates 
(such as 3 frames per second) can be achieved and thus, providing more intelligent 
feedback regarding the occurrence of complex object actions such as inventory theft 
(Col 2 lines 59-63) can be realized. 

Regarding claims 2 and 20, Courtney teaches sampling said video content at a 
sample rate of a divisor of 30 frames per second and 12 frames per second. (Col 15
line 28-Col 16 line 24, Fig. 24, Col 16 line 51 -Col 7 line 40) It should be noted that 
Courtney teaches 315 frames captured at approximately 3 frames per second (Col 16 
lines 8-25) It would have been obvious to one of ordinary skill in the art at the time the 
invention was made to combine the teachings of sampling video content at 3 frames per 
second (a divisor of a plurality of standard playback rates) into the combination of 
Rangan and Feinleib because testing the quality of motion based event detection of
differing frame rates (such as 3 frames per second) can be achieved and thus, providing 
more intelligent feedback regarding the occurrence of complex object actions such as 
inventory theft (Col 2 lines 59-63) can be realized. 
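
The divisor relationship relied upon above can be checked with simple arithmetic. The following is a minimal sketch only, not taken from any cited reference; the function names are hypothetical, and the playback rates used (30, 15 and 12 frames per second) are the ones named in this action.

```python
from functools import reduce
from math import gcd


def common_divisors(rates):
    """Return every positive integer that divides all rates in `rates`."""
    greatest = reduce(gcd, rates)
    return [d for d in range(1, greatest + 1) if greatest % d == 0]


def is_common_divisor(sample_rate, rates):
    """True when `sample_rate` divides each playback rate with no remainder."""
    return all(rate % sample_rate == 0 for rate in rates)


# Playback rates named in this action (frames per second); illustrative only.
playback_rates = [30, 15, 12]

print(common_divisors(playback_rates))       # [1, 3]
print(is_common_divisor(3, playback_rates))  # True
```

Under these assumptions, a sample rate of 3 frames per second divides each listed playback rate evenly, so every sampled frame falls on a whole frame boundary at each of those rates.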

Regarding claims 3 and 21, Courtney teaches a sample rate of at least 3 frames
per second. (Col 15 line 28-Col 16 line 24, Fig. 24, Col 16 line 51 -Col 7 line 40) It should 
be noted that Courtney teaches 315 frames captured at approximately 3 frames per 
second (Col 16 lines 8-25) It would have been obvious to one of ordinary skill in the art 
at the time the invention was made to combine the teachings of sampling video content 
at 3 frames per second (a divisor of a plurality of standard playback rates) into the 
combination of Rangan and Feinleib because testing the quality of motion based event 
detection of differing frame rates (such as 3 frames per second) can be achieved and 
thus, providing more intelligent feedback regarding the occurrence of complex object 
actions such as inventory theft (Col 2 lines 59-63) can be realized. 

Regarding claims 13, 16 and 23 and 26, Courtney teaches sampling said video 
content at a sample rate of a multiple of NTSC and PAL (movie) frame rates. (Col 15 
line 28-Col 16 line 24, Fig. 24, Col 16 line 51 -Col 7 line 40) It should be noted that 
Courtney teaches 315 frames captured at approximately 3 frames per second (Col 16 
lines 8-25) and it is well known in the art to utilize standard playback rates such as 
NTSC at 30 FPS and FPS at 12 FPS. It would have been obvious to one of ordinary 
skill in the art at the time the invention was made to combine the teachings of sampling 
video content at 3 frames per second (a divisor of a plurality of standard playback rates) 
into the combination of Rangan and Feinleib because testing the quality of motion
based event detection of differing frame rates (such as 3 frames per second) can be 
achieved and thus, providing more intelligent feedback regarding the occurrence of 
complex object actions such as inventory theft (Col 2 lines 59-63) can be realized. 

Regarding claims 14-15 and 24-25, Courtney teaches sampling video content at a
sample rate of a multiple of NTSC, PAL, 15 FPS and 12 FPS frame rates. (Col 15 line 
28-Col 16 line 24, Fig. 24, Col 16 line 51 -Col 7 line 40) It should be noted that Courtney 
teaches 315 frames captured at approximately 3 frames per second (Col 16 lines 8-25) 
and it is well known in the art to utilize standard playback rates such as NTSC at 30 
FPS and FPS at 12 FPS. It would have been obvious to one of ordinary skill in the art 
at the time the invention was made to combine the teachings of sampling video content 
at 3 frames per second (a divisor of a plurality of standard playback rates) into the 
combination of Rangan and Feinleib because testing the quality of motion based event 
detection of differing frame rates (such as 3 frames per second) can be achieved and 
thus, providing more intelligent feedback regarding the occurrence of complex object 
actions such as inventory theft (Col 2 lines 59-63) can be realized. 

Regarding claims 11 and 22, Rangan teaches including a video playback
application for playing back video content and said linked video files, wherein said video 
playback application is configured to determine if locations selected by a pointing device 
during playback of the video content correspond to said predetermined pixel objects and 
provide a link to a data object when said selected location corresponds to one of said 
determined pixel objects. (Col 7 lines 35-52) It should be noted that the pointing device as
taught by Rangan is a mouse.
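
The playback behavior recited above (determining whether a location selected by a pointing device corresponds to a pixel object in the current frame and, if so, providing a link to a data object) can be illustrated as a lookup followed by a hit test. This is a hedged sketch only. The `Region` type, the dictionary layouts standing in for the pixel object file and data object file, and the example link are all hypothetical and are not drawn from Rangan, Feinleib or the application.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Region:
    """Hypothetical rectangular location of a pixel object within one frame."""
    x: int
    y: int
    width: int
    height: int

    def contains(self, px: int, py: int) -> bool:
        return (self.x <= px < self.x + self.width
                and self.y <= py < self.y + self.height)


# Assumed layout of a pixel object file: frame number -> {object id -> region}.
PixelObjectFile = Dict[int, Dict[str, Region]]
# Assumed layout of a separate data object file: object id -> linked data.
DataObjectFile = Dict[str, str]


def resolve_click(frame: int, px: int, py: int,
                  pixel_objects: PixelObjectFile,
                  data_objects: DataObjectFile) -> Optional[str]:
    """Return the linked data object for a click, or None when nothing is hit."""
    for object_id, region in pixel_objects.get(frame, {}).items():
        if region.contains(px, py):
            return data_objects.get(object_id)
    return None


# Illustrative data only.
pixel_objects = {12: {"jacket": Region(100, 50, 40, 60)}}
data_objects = {"jacket": "https://example.com/jacket"}

print(resolve_click(12, 110, 70, pixel_objects, data_objects))
print(resolve_click(12, 5, 5, pixel_objects, data_objects))
```

Keeping the two mappings separate mirrors the claimed arrangement in which the data object file is distinct from, but linked to, the pixel object file through a shared object identifier.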

Claims 4-5 are rejected under 35 U.S.C. 103(a) as being unpatentable over
Rangan (6198833) in view of Feinleib (6637032) in further view of Courtney (6424370)
and Toklu (6549643).

Regarding claim 4, Rangan, Feinleib and Courtney do not explicitly teach said 
video linking system is configured to identify segment breaks in said video content. 
This is what Toklu teaches. Toklu teaches video summarization methods typically 
include segmenting a video into an appropriate set of segments such as video "shots" 
and selecting one or more key-frames from the shots. (Col 1, lines 34-37) It should be
noted that a key-frame is defined in the art to be a frame used to indicate the beginning 
or end of a change made to the signal and therefore, an implied segment break. It 
would have been obvious to one of ordinary skill in the art at the time the
invention was made to combine video summarization methods configured to identify 
segment breaks as taught by Toklu with the image processing system as taught by 
Rangan in order to reduce the number of images to one or more key-frames to 
represent the content of a given shot (Col 1, lines 43-45) and thus, to generate a video
summary. (Col 1, line 33).

Regarding claim 5, Rangan, Feinleib and Courtney do not explicitly teach said 
segment breaks are determined by determining the median average pixel values for a 
series of frames and comparing changes in the pixel values relative to the median 
average and indicating a segment break when the change in pixel values represents at 
least a predetermined change relative to the median average. This is what Toklu
teaches. Toklu teaches determining median average pixel values for a series of 
frames by showing computing an average of an absolute pixel-based intensity 
difference between consecutive frames in each segment, and for each segment, 
computing a cumulative sum of the average of the absolute pixel-based intensity 
differences for the corresponding frames of the segment. (Col 3, lines 61-67) Toklu 
also teaches comparing changes in pixel values relative to median average by 
explaining selecting the first frame in each motion activity segment of a given segment 
frame if the cumulative sum of the average of the absolute pixel-based intensity 
differences for the frames of the given segment does not exceed a first predefined 
threshold. (Col 4, lines 1-5) Lastly, Toklu teaches indicating a segment break when the 
change in pixel values represents at least a predetermined change relative to the 
median average by showing selecting a predefined number of key-frames in the given 
segment uniformly, if the cumulative sum of the average of the absolute pixel-based 
intensity differences for the frames of the given segment exceeds the first predefined 
threshold. (Col 4, lines 5-9) It should be noted that a key-frame is defined in the art to 
be a frame used to indicate the beginning or end of a change made to the signal and 
therefore an implied segment break. It would have been obvious to one of ordinary 
skill in the art at the time the invention was made to combine determining the
average pixel values for a series of frames, comparing changes in pixel values relative 
to the average and indicating a segment break when the change in pixel values 
represents at least a predetermined change relative to the median average as taught 
by Toklu with the image processing system as taught by Rangan in order to measure a
temporal activity curve for dissimilarity based on frame differences (Col 3, lines 60-62)
and thus make possible the system and method for selecting key-frames from video
data. (Col 3, lines 51-59)
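
The key-frame selection attributed to Toklu above can be sketched as follows. This is an illustrative simplification only, not Toklu's implementation: frames are assumed to be flat lists of pixel intensities, the segment is treated as a single unit rather than being divided into motion activity segments, and the function and parameter names are hypothetical.

```python
def mean_abs_diff(frame_a, frame_b):
    """Average absolute pixel-based intensity difference between two frames."""
    total = sum(abs(a - b) for a, b in zip(frame_a, frame_b))
    return total / len(frame_a)


def select_key_frames(frames, threshold, num_key_frames):
    """Return indices of key-frames for one segment.

    The per-frame differences between consecutive frames are summed. If the
    cumulative sum does not exceed `threshold`, only the first frame is kept;
    otherwise `num_key_frames` frames are picked uniformly across the segment.
    """
    if not frames:
        return []
    cumulative = sum(
        mean_abs_diff(frames[i], frames[i + 1])
        for i in range(len(frames) - 1)
    )
    if cumulative <= threshold:
        return [0]
    count = min(num_key_frames, len(frames))
    step = len(frames) / count
    return [int(i * step) for i in range(count)]


# Illustrative two-pixel frames only.
static_segment = [[10, 10], [10, 10], [12, 10]]
active_segment = [[0, 0], [50, 50], [100, 100], [150, 150]]

print(select_key_frames(static_segment, threshold=5, num_key_frames=2))  # [0]
print(select_key_frames(active_segment, threshold=5, num_key_frames=2))  # [0, 2]
```

In the first segment the cumulative difference is 1.0, which is below the assumed threshold of 5, so a single key-frame is kept. In the second it is 150.0, so two key-frames are spread uniformly across the four frames.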

Claims 18 and 28 are rejected under 35 U.S.C. 103(a) as being unpatentable 
over Rangan (6198833) in view of Feinleib (6637032) in further view of Courtney 
(6424370) and Toyama (5204749). 

Regarding claims 18 and 28, neither Rangan nor Feinleib explicitly teaches 
automatically determining changes in the characteristics of said one or more pixel 
objects based upon changes in lighting and automatically compensating based upon
those changes. This is what Toyama teaches. (Col 3 lines 59-62, Col 13 line 40-Col 14 
line 50, Fig. 9) It should be noted that Toyama teaches automatically detecting changes 
in the follow-up field of the object (automatically shifting color coordinate plane of values 
(R-Y/Y) and (B-Y/Y) from points A0, B0, C0 to A1, B1, C1 [Fig. 9]). Furthermore it
should be noted that Toyama teaches said changes in the color difference signals are 
based on (accounting for) changes in lighting and also permits stable follow-up by 
automatically compensating for variations in luminance of the illuminating light. It would 
have been obvious to one of ordinary skill in the art at the time the invention was made 
to combine the teachings of automatically determining changes of one or more pixel 
objects based upon changes in lighting and automatically compensating for those 
changes into the system of Rangan because prevention of each of the points on the 
coordinate system from coming close to the origin or moving farther away from the origin
(due to luminance of light varying with time) while the object is not moving can be
realized (Col 15 lines 48-54) and thus, stably performing a follow-up operation despite
variations in luminance of illuminating light (Col 3 lines 59-62) can be achieved.
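
The reason luminance-normalized color difference values remain stable under changing illumination can be shown with a short worked example. This is a sketch under stated assumptions, not Toyama's circuit: the values are read as (R-Y)/Y and (B-Y)/Y, luminance Y is computed with the ITU-R BT.601 weights (an assumption, as the action does not say how Y is derived), and a change in lighting is modeled as a uniform scaling of R, G and B.

```python
import math


def luminance(r, g, b):
    """Luminance Y using ITU-R BT.601 weights (assumed for illustration)."""
    return 0.299 * r + 0.587 * g + 0.114 * b


def normalized_color_difference(r, g, b):
    """Return ((R - Y) / Y, (B - Y) / Y) for one RGB sample."""
    y = luminance(r, g, b)
    if y == 0:
        raise ValueError("luminance is zero; normalized values are undefined")
    return ((r - y) / y, (b - y) / y)


# Same object color under full light and under half the illumination.
bright = normalized_color_difference(200, 120, 80)
dim = normalized_color_difference(100, 60, 40)

# Scaling R, G and B by a common factor scales Y by the same factor,
# so both ratios are unchanged (up to floating-point rounding).
print(all(math.isclose(a, b) for a, b in zip(bright, dim)))  # True
```

Because each ratio equals R/Y - 1 or B/Y - 1, a common brightness factor cancels, which is consistent with the stated goal of keeping the points on the coordinate system from drifting toward or away from the origin while the object itself is not moving.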

Conclusion

THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time 
policy as set forth in 37 CFR 1.136(a).

A shortened statutory period for reply to this final action is set to expire THREE 
MONTHS from the mailing date of this action. In the event a first reply is filed within 
TWO MONTHS of the mailing date of this final action and the advisory action is not 
mailed until after the end of the THREE-MONTH shortened statutory period, then the 
shortened statutory period will expire on the date the advisory action is mailed, and any 
extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of
the advisory action. In no event, however, will the statutory period for reply expire later 
than SIX MONTHS from the mailing date of this final action. 

Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to Kevin K. Xu whose telephone number is 571-272-7747. 
The examiner can normally be reached on 8:30AM - 5:00 PM. 

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, Mark Zimmerman can be reached on 571-272-7653. The fax phone number 
for the organization where this application or proceeding is assigned is 571-273-8300. 

Information regarding the status of an application may be obtained from the
Patent Application Information Retrieval (PAIR) system. Status information for 
published applications may be obtained from either Private PAIR or Public PAIR. 
Status information for unpublished applications is available through Private PAIR only. 
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should 
you have questions on access to the Private PAIR system, contact the Electronic 
Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a 
USPTO Customer Service Representative or access to the automated information 
system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. 

/Kee M Tung/ 

Supervisory Patent Examiner, Art Unit 2628 



/Kevin K. Xu/
Examiner, Art Unit 2628 
8/1/08 



/K. K. X./



